Variable Selection in Normal Mixture Model Based Clustering under Heteroscedasticity
نویسندگان
چکیده
منابع مشابه
Variable Selection for Model-Based Clustering
We consider the problem of variable or feature selection for model-based clustering. We recast the problem of comparing two nested subsets of variables as a model comparison problem, and address it using approximate Bayes factors. We develop a greedy search algorithm for finding a local optimum in model space. The resulting method selects variables (or features), the number of clusters, and the...
متن کاملVariable selection in clustering via Dirichlet process mixture models
The increased collection of high-dimensional data in various fields has raised a strong interest in clustering algorithms and variable selection procedures. In this paper, we propose a model-based method that addresses the two problems simultaneously. We introduce a latent binary vector to identify discriminating variables and use Dirichlet process mixture models to define the cluster structure...
متن کاملBayesian Regularization for Normal Mixture Estimation and Model-Based Clustering
Normal mixture models are widely used for statistical modeling of data, including cluster analysis. However maximum likelihood estimation (MLE) for normal mixtures using the EM algorithm may fail as the result of singularities or degeneracies. To avoid this, we propose replacing the MLE by a maximum a posteriori (MAP) estimator, also found by the EM algorithm. For choosing the number of compone...
متن کاملFinite mixture regression: a sparse variable selection by model selection for clustering
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau...
متن کاملVariable selection in model-based clustering using multilocus genotype data
We propose a variable selection procedure in model-based clustering multilocus genotype data. Indeed, it may happen that some loci are not relevant for clustering into statistically different populations. Inferring the number K of clusters and the relevant clustering subset S of loci is regarded as a model selection problem. The competing models are compared using penalized maximum likelihood c...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Korean Journal of Applied Statistics
سال: 2011
ISSN: 1225-066X
DOI: 10.5351/kjas.2011.24.6.1213